An $N$ player normal form game consists of:

- a finite set of $N$ players;
- strategy spaces $S_1, S_2, \dots, S_N$: one for each player;
- utility functions $u_i: S_1\times S_2\times\dots\times S_N\to\mathbb{R}$: one for each player.

[2]
We have the game given by:
$$A = \begin{pmatrix}30 & 6\\ 12 & 30\end{pmatrix}\qquad B = \begin{pmatrix}12 & 30\\ 42 & 12\end{pmatrix}$$
In [1]:
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
plt.figure()
ys = [0, 1]
plt.plot(ys, [30 * y + 6 * (1 - y) for y in ys], label="$u_r((1, 0), (y, 1-y))$")
plt.plot(ys, [12 * y + 30 * (1 - y) for y in ys], label="$u_r((0, 1), (y, 1-y))$")
plt.legend()
plt.xlabel("$y$")
plt.title("Utility to row player");
[1]
In [2]:
plt.figure()
xs = [0, 1]
plt.plot(xs, [12 * x + 42 * (1 - x) for x in xs], label="$u_c((x, 1 - x), (1, 0))$")
plt.plot(xs, [30 * x + 12 * (1 - x) for x in xs], label="$u_c((x, 1 - x), (0, 1))$")
plt.legend()
plt.xlabel("$x$")
plt.title("Utility to column player");
The points of intersection:
For the row player:
In [3]:
import sympy as sym
x, y = sym.symbols('x, y')
sym.solveset(sym.Eq(24 * y + 6, 30 - 18 * y), y)
Out[3]:
For the column player:
In [4]:
sym.solveset(sym.Eq(42 - 30 * x, 18 * x + 12), x)
Out[4]:
[1]
The best responses are then given by:
$$ \sigma_r^* = \begin{cases} (0,1)&\text{ if }y< 4/7\\ (1,0)&\text{ if }y> 4/7\\ \text{indifferent}&\text{ otherwise} \end{cases} $$$$ \sigma_c^* = \begin{cases} (1,0)&\text{ if }x< 5/8\\ (0,1)&\text{ if }x> 5/8\\ \text{indifferent}&\text{ otherwise} \end{cases} $$[1]
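As a quick sanity check (not part of the original answer), the two utilities can be evaluated just either side of each intersection point; the offsets of 0.01 used below are arbitrary:

# Row player: compare the two pure strategy utilities either side of y = 4/7
for y_value in (4 / 7 - 0.01, 4 / 7 + 0.01):
    print(y_value, 24 * y_value + 6, 30 - 18 * y_value)
# Column player: compare the two pure strategy utilities either side of x = 5/8
for x_value in (5 / 8 - 0.01, 5 / 8 + 0.01):
    print(x_value, 42 - 30 * x_value, 18 * x_value + 12)

Below each intersection the second expression is the larger one and above it the first, which matches the cases listed above.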
$u_c((x, 1 - x), (0, 0, 1))=22x+6(1-x)=6+16x$
In [5]:
plt.figure()
xs = [0, 1]
plt.plot(xs, [12 * x + 42 * (1 - x) for x in xs], label="$u_c((x, 1 - x), (1, 0))$")
plt.plot(xs, [30 * x + 12 * (1 - x) for x in xs], label="$u_c((x, 1 - x), (0, 1))$")
plt.plot(xs, [16 * x + 6 for x in xs], label="$u_c((x, 1 - x), (0, 0, 1))$")
plt.legend()
plt.xlabel("$x$")
plt.title("Utility to column player");
[2]
We see that the third column is never a best response and so the best responses remain unchanged.
[4]
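As an additional optional check that $6 + 16x$ never exceeds the larger of the other two utilities on $[0, 1]$, a rough numerical sketch (the grid of values is arbitrary):

import numpy as np

# Compare the third column's utility against the best of the other two over a grid of x values
x_grid = np.linspace(0, 1, 101)
third_column = 16 * x_grid + 6
best_of_others = np.maximum(42 - 30 * x_grid, 18 * x_grid + 12)
(third_column < best_of_others).all()  # expected to be True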
For a two player game $(A, B)\in{\mathbb{R}^{m\times n}_{>0}}^2$ the row/column player best response polytope $\mathcal{P}$/$\mathcal{Q}$ is defined by:
$$ \mathcal{P} = \left\{x\in\mathbb{R}^{m}\;|\;x\geq 0; xB\leq 1\right\} $$$$ \mathcal{Q} = \left\{y\in\mathbb{R}^{n}\;|\; Ay\leq 1; y\geq 0 \right\} $$[2]
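A small sketch (not part of the original definition) of how membership of these polytopes can be checked numerically; the tolerance and helper function names are mine, and the point tested is the unnormalised vertex pair found in the Lemke-Howson walk through below:

import numpy as np

A = np.array([[30, 6, 36], [12, 30, 18]])
B = np.array([[12, 30, 22], [42, 12, 6]])

def in_row_polytope(x, B, tol=1e-9):
    """Check x >= 0 and xB <= 1, i.e. membership of P (hypothetical helper)."""
    x = np.array(x)
    return bool((x >= -tol).all() and (x @ B <= 1 + tol).all())

def in_col_polytope(y, A, tol=1e-9):
    """Check Ay <= 1 and y >= 0, i.e. membership of Q (hypothetical helper)."""
    y = np.array(y)
    return bool((y >= -tol).all() and (A @ y <= 1 + tol).all())

in_row_polytope([900 / 33480, 18 / 1116], B), in_col_polytope([24 / 828, 540 / 24840, 0], A)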
For a nondegenerate two player game $(A, B)\in{\mathbb{R}^{m\times n}_{>0}}^2$ the following algorithm (the Lemke-Howson algorithm) returns a Nash equilibrium:
[4]
The following is code used to carry out the various pivoting operations (students would do this by hand).
In [6]:
import numpy as np
import nashpy as nash
# Row player tableau: B transposed, followed by slack variable columns and a column of ones
row_tableau = np.array([[12, 42, 1, 0, 0, 1],
[30, 12, 0, 1, 0, 1],
[22, 6, 0, 0, 1, 1]])
# Column player tableau: slack variable columns, followed by A and a column of ones
col_tableau = np.array([[1, 0, 30, 6, 36, 1],
[0, 1, 12, 30, 18, 1]])
Let us drop label 0:
In [7]:
dropped_label = nash.integer_pivoting.pivot_tableau(row_tableau, column_index=0)
row_tableau
Out[7]:
We have picked up label 3, let us drop label 3 from the column tableau:
In [8]:
dropped_label = nash.integer_pivoting.pivot_tableau(col_tableau, column_index=3)
col_tableau
Out[8]:
We have picked up label 1, let us drop label 1 from the row tableau:
In [9]:
dropped_label = nash.integer_pivoting.pivot_tableau(row_tableau, column_index=1)
row_tableau
Out[9]:
We have picked up label 2, so let us drop label 2 from the column tableau:
In [10]:
dropped_label = nash.integer_pivoting.pivot_tableau(col_tableau, column_index=2)
col_tableau
Out[10]:
The row tableau has labels: $\{2, 3\}$ and the column tableau has labels: $\{0, 1, 4\}$ thus we have a fully labeled vertex pair.
The vertex pair is given by:
$$ \left((900/33480, 18/1116), (24/828, 540/24840, 0)\right) $$which, normalising each vector so that it sums to one, gives the Nash equilibrium:
$$ \left((5/8, 3/8), (4/7, 3/7, 0)\right) $$[7]
This confirms the previous result as it corresponds to a pair of best responses.
[1]
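The normalisation step can also be checked directly (a small sketch; the variable names are mine):

import numpy as np

# Normalise each vector of the fully labeled vertex pair so that it sums to one
row_vertex = np.array([900 / 33480, 18 / 1116])
col_vertex = np.array([24 / 828, 540 / 24840, 0])
row_vertex / row_vertex.sum(), col_vertex / col_vertex.sum()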
Some code to confirm the numeric calculations:
In [11]:
A = np.array([[30, 6, 36], [12, 30, 18]])
B = np.array([[12, 30, 22], [42, 12, 6]])
game = nash.Game(A, B)
game.lemke_howson(initial_dropped_label=0)
Out[11]:
Given a two player game $(A,B)\in{\mathbb{R}^{m\times n}}^2$, referred to as a stage game, a $T$-stage repeated game is a game in which players play that stage game for $T > 0$ periods. Players make decisions based on the full history of play over all the periods.
[2]
Given a two player stage game $(A,B)\in{\mathbb{R}^{m\times n}}^2$ repeated to give a $T$-stage repeated game, a strategy is a mapping from the entire history of play to an action of the stage game:
$$ \bigcup_{t=0}^{T-1}H(t)\to a $$where $H(t)$ is the set of histories of play of both players up until stage $t$ (so that $H(0)=\{(\emptyset, \emptyset)\}$) and $a$ is an action of the stage game.
[2]
For $S_1=\{r_1, r_2\}$, $S_2=\{c_1, c_2, c_3\}$ and $T=2$ the set of histories is:
$$\{(\emptyset, \emptyset), (r_1, c_1), (r_1, c_2), (r_1, c_3), (r_2, c_1), (r_2, c_2), (r_2, c_3)\}$$[3]
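These histories can be enumerated directly (a small sketch using itertools; the string labels are mine):

import itertools

S1 = ("r1", "r2")
S2 = ("c1", "c2", "c3")
# The empty history together with every possible pair of first period actions
histories = [("empty", "empty")] + list(itertools.product(S1, S2))
len(histories), histories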
The pure Nash equilibria of the stage game are found by underlining best responses:
$$A = \begin{pmatrix}\underline{1} & \underline{4} & \underline{-1}\\-1 & 0 & -2\\\end{pmatrix} \qquad B = \begin{pmatrix}-2 & \underline{5} & \underline{5}\\\underline{18} & -1 & -2\\\end{pmatrix}$$giving the stage Nash equilibria $(r_1, c_2)$ and $(r_1, c_3)$. Playing a stage Nash equilibrium in every period gives a Nash equilibrium of the repeated game. Thus, for our example we have the four Nash equilibria:
$$\{((r_1, c_2), (r_1, c_2)),\; ((r_1, c_2), (r_1, c_3)),\; ((r_1, c_3), (r_1, c_2)),\; ((r_1, c_3), (r_1, c_3))\}$$
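The underlined best responses can be checked with a short numpy sketch (the variable names are mine): for each column the best rows of $A$ are found, and for each row the best columns of $B$:

import numpy as np

A_stage = np.array([[1, 4, -1], [-1, 0, -2]])
B_stage = np.array([[-2, 5, 5], [18, -1, -2]])
# Indices of the row player's best responses to each column of A
print([np.flatnonzero(column == column.max()) for column in A_stage.T])
# Indices of the column player's best responses to each row of B
print([np.flatnonzero(row == row.max()) for row in B_stage])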
Consider the following two strategies:
For the row player:
$$(\emptyset, \emptyset) \to r_2$$ $$(r_2, c_1) \to r_1$$ $$(r_2, c_2) \to r_1$$ $$(r_2, c_3) \to r_1$$
[2]
For the column player:
$$(\emptyset, \emptyset) \to c_1$$ $$(r_1, c_1) \to c_3$$ $$(r_2, c_1) \to c_2$$
[2]
This is a Nash equilibrium because:
[1]
According to the definition of a repeated game, the strategy spaces for both players are:
$$ S_1 = S_2 = \{CC, CS, SC, SS\} $$[3]
Here "S" stands for "stop" and "C" for "continue"; the first letter corresponds to what the player will do at their first decision and the second letter to what they will do at their second decision.
Using this, and reading from the diagram, the payoff matrix is obtained.
[3]
There are 4 pure Nash equilibria:
$$\{(SC, SC), (SC, SS), (SS, SC), (SS, SS)\}$$which all imply that both players stop the game at their first opportunity.
[3]
The general form of the Prisoner's Dilemma is:
$$ A = \begin{pmatrix} R & S\\ T & P \end{pmatrix} \qquad B = \begin{pmatrix} R & T\\ S & P \end{pmatrix} $$with the following constraints:
$$T > R > P > S$$$$2R > T + S$$[2]
(i)
For $A, B$ to be valid we need:
$$ \begin{pmatrix} 3 & S\\ 6 & 1 \end{pmatrix} = \begin{pmatrix} R & S\\ T & P \end{pmatrix} $$which gives: $R=3$, $T=6$ and $P=1$
furthermore:
$$ \begin{pmatrix} 3 & T\\ -1 & 1 \end{pmatrix} = \begin{pmatrix} R & T\\ S & P \end{pmatrix} $$[2]
which gives: $R=3, S=-1$
Thus we have $(R, S, P, T) = (3, -1, 1, 6)$, which also satisfies the two required inequalities:
$$T > R > P > S \Leftrightarrow 6 > 3 > 1 > -1$$$$2R > T + S \Leftrightarrow 6 > 5$$[2]
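A quick check of these two inequalities in python (a hypothetical helper, not from the course library):

def is_prisoners_dilemma(R, S, T, P):
    """Check the two defining constraints of the Prisoner's Dilemma."""
    return T > R > P > S and 2 * R > T + S

is_prisoners_dilemma(R=3, S=-1, T=6, P=1)  # expected: True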
(ii)
For $A, B$ to be valid we need:
$$ \begin{pmatrix} 2 & S\\ -2 & 1 \end{pmatrix} = \begin{pmatrix} R & S\\ T & P \end{pmatrix} $$which gives: $R=2, T=-2, P=1$
We immediately see that $R > T$ so this cannot be a Prisoner's Dilemma.
[4]
For $A, B$ to be valid we need:
$$ \begin{pmatrix} b - c & c\\ b & 1 \end{pmatrix} = \begin{pmatrix} R & S\\ T & P \end{pmatrix} $$thus $P=1$, $S=c$, $T=b$, $R=b-c$. The constraints imply: $b > b - c > 1 > c$ and $2(b - c) > b + c$. [1]
A theorem from the notes states that the steady state probabilities of being in states $(CC, CD, DC, DD)$ are given by:
$$ \pi=(s_1s_2, s_1(1-s_2), (1-s_1)s_2, (1-s_1)(1-s_2)) $$Using this with the new utilities gives the result immediately.
[3]
For $(b, c)=(2, 1/2)$, the given formula is:
$$ s_1s_2\times 3/2 + s_1(1-s_2)/2 + (1-s_1)s_2 \times 2 + (1-s_1)(1-s_2) $$which gives:
$$ 3s_1s_2/2 + s_1/2-s_1s_2/2 + 2s_2 - 2s_1s_2 + 1 -s_1 - s_2 + s_1s_2 = s_1s_2(3/2-1/2 - 2 + 1) + s_1(1/2-1) + s_2(2-1) + 1 $$which corresponds to:
$$ u = s_2 - s_1 / 2 + 1 $$[3]
In [12]:
s = sym.symbols("s:2")  # s[0] and s[1] correspond to s_1 and s_2 in the text
expr = s[0] * s[1] * sym.S(3) / 2 + s[0] * (1 - s[1]) / 2 + (1 - s[0]) * s[1] * 2 + (1 - s[0]) * (1 - s[1])
expr.expand()
Out[12]:
If $q_1=q_2$ then:
$$ r_2 = 0 \Rightarrow s_1=q_2r_1+p_2 \text{ and }s_2=q_2 $$[1]
thus the utility is given by:
$$ - q_2r_1/2 - p_2/2+ q_2 + 1 $$[1]
This is maximised when the first two terms are 0. The player in question controls $r_1=p_1 - p_2$ and $p_2$, thus $p_1 = p_2 = 0$ is optimal behaviour.
This can also be seen intuitively: if an opponent is playing independently of our actions then we should always defect.
[2]
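A quick sympy check of the utility expression used above (a sketch; the symbols are declared here for illustration):

import sympy as sym

q_2, r_1, p_2 = sym.symbols("q_2, r_1, p_2")
s_1 = q_2 * r_1 + p_2  # steady state cooperation rate of the first player when r_2 = 0
s_2 = q_2              # steady state cooperation rate of the second player when r_2 = 0
sym.expand(s_2 - s_1 / 2 + 1)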
The matrix $A$ corresponds to the utility of a row player in a game where the row player is a given individual and the column player is the population.
This gives:
$$f_1=ax_1+bx_2\qquad f_2=cx_1+dx_2$$[1]
or equivalently:
$$f=Ax\qquad \phi=fx$$thus we have the same equation as before but in matrix notation:
$$\frac{dx}{dt}=x(f-\phi)$$[1]
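A small sympy sketch of this notation for a general $2\times 2$ matrix $A=\begin{pmatrix}a&b\\c&d\end{pmatrix}$ (the symbol and variable names are mine):

import sympy as sym

a, b, c, d, x_1, x_2 = sym.symbols("a, b, c, d, x_1, x_2")
A_general = sym.Matrix([[a, b], [c, d]])
x_vector = sym.Matrix([x_1, x_2])
f = A_general * x_vector        # fitness of each strategy
phi = (x_vector.T * f)[0]       # average fitness in the population
# The two components of dx/dt = x(f - phi)
sym.expand(x_1 * (f[0] - phi)), sym.expand(x_2 * (f[1] - phi))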
Given a strategy vector $x=(x_1, x_2)$, some $\epsilon>0$ and another strategy $y=(y_1, y_2)$, the post entry population $x_{\epsilon}$ is given by:
$$ x_{\epsilon} = (x_1 + \epsilon(y_1 - x_1), x_2 + \epsilon(y_2 - x_2)) $$[2]
A stable population distribution $x$ represents an Evolutionary Stable Strategy (ESS) if and only if there exists $\bar\epsilon>0$ such that:
$$u(x, x_{\epsilon})>u(y, x_{\epsilon})\text{ for all }0<\epsilon<\bar\epsilon\text{ and all }y\ne x$$[1]
where $u(x, y)$ corresponds to the fitness of strategy $x$ in population $y$ which is given by:
$$xAy^T$$[1]
Theorem:
If $x$ is an ESS, then for all $y\ne x$, either:

1. $u(x, x) > u(y, x)$; or
2. $u(x, x) = u(y, x)$ and $u(x, y) > u(y, y)$.
Conversely, if either (1) or (2) holds for all $y\ne x$ then $x$ is an ESS.
[2]
Proof:
If $x$ is an ESS, then by definition:
$$u(x,x_{\epsilon})>u(y,x_{\epsilon})$$which corresponds to:
$$(1-\epsilon)u(x,x)+\epsilon u(x,y)>(1-\epsilon)u(y,x)+\epsilon u(y,y)$$If condition (1) holds then this inequality is satisfied for $\epsilon$ sufficiently small, and if condition (2) holds then it is satisfied for all $\epsilon$.

Conversely:
If $u(x,x) < u(y,x)$ then we can find $\epsilon$ sufficiently small such that the inequality is violated. [1]
If $u(x, x) = u(y,x)$ and $u(x,y) \leq u(y,y)$ then the inequality is violated. [1]
The first step is to identify the Nash equilibria. Identify best responses for the associated two player game $(A, A^T)$:
$$ A=\begin{pmatrix} \underline{2}&\underline{3}\\ 0&1 \end{pmatrix}\qquad A^T=\begin{pmatrix} \underline{2}&0\\ \underline{3}&1 \end{pmatrix} $$[1]
This immediately gives a single symmetric pure Nash equilibrium (as the first strategy dominates the second, this is the only Nash equilibrium):
$$ (\sigma_r, \sigma_c) = ((1, 0), (1, 0)) $$Some code to verify this:
In [13]:
A = np.array([[2, 3], [0, 1]])
game = nash.Game(A, A.transpose())
list(game.support_enumeration())
Out[13]:
As this strategy obeys the first condition of the theorem, it is the evolutionary stable strategy.
[4]
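Finally, a small numerical sketch (the helper name and sample values are mine) checking condition (1) of the theorem for $x=(1, 0)$, i.e. that $u(x, x) > u(y, x)$ for a few candidate strategies $y\ne x$:

import numpy as np

A = np.array([[2, 3], [0, 1]])
x = np.array([1, 0])

def fitness(strategy, population):
    """u(strategy, population) = strategy A population^T (hypothetical helper)."""
    return strategy @ A @ population

for y_1 in (0, 0.25, 0.5, 0.75):
    y = np.array([y_1, 1 - y_1])
    print(y, fitness(x, x) > fitness(y, x))  # expected: True for every y != x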